Smart Paper Notes

Enhancing Deep Learning-based 3-lead ECG Classification with Heartbeat Counting and Demographic Data Integration

Khiem H. Le , Hieu H. Pham , Thao B. T. Nguyen , Tu A. Nguyen , Cuong D. Do

Category: Computer Vision

2022-08-15

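Nowadays, an increasing number of people are diagnosed with cardiovascular diseases (CVDs), the leading cause of death globally. The gold standard for identifying these heart problems is the electrocardiogram (ECG). The standard 12-lead ECG is widely used in clinical practice and in most current research. However, using fewer leads can make ECG more pervasive, because it can be integrated with portable or wearable devices. This article introduces two novel techniques that improve the performance of current deep learning systems for 3-lead ECG classification, making them comparable with models trained on standard 12-lead ECG. Specifically, we propose a multi-task learning scheme in the form of heartbeat count regression, and an effective mechanism for integrating patient demographic data into the system. With these two advances, we obtain F1 scores of 0.9796 and 0.8140 on two large-scale ECG datasets, Chapman and CPSC-2018, respectively, surpassing current state-of-the-art ECG classification methods, even those trained on 12-lead data. To encourage further development, our source code is publicly available at https://github.com/lhkhiem28/lightx3ecg.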
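
The two ideas in this abstract can be illustrated with a minimal sketch. It assumes a multi-label classification loss with an added L1 regression term on the heartbeat count, and a simple concatenation of demographic values with the signal features. The function names, the `aux_weight` value, and the age scaling are hypothetical and are not taken from the paper or its repository.

```python
import math

def binary_cross_entropy(probs, labels, eps=1e-7):
    """Mean binary cross-entropy over the label set (multi-label classification)."""
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(labels)

def multitask_loss(probs, labels, predicted_beats, true_beats, aux_weight=0.1):
    """Classification loss plus a weighted heartbeat-count regression term.

    aux_weight is a hypothetical hyperparameter, not a value from the paper.
    """
    cls_loss = binary_cross_entropy(probs, labels)
    reg_loss = abs(predicted_beats - true_beats)  # L1 error on the beat count
    return cls_loss + aux_weight * reg_loss

def fuse_with_demographics(signal_features, age_years, is_male):
    """Append a scaled age and a sex flag to the signal features (one possible fusion)."""
    return list(signal_features) + [age_years / 100.0, 1.0 if is_male else 0.0]
```

In this sketch the auxiliary term only penalizes a wrong beat count, so a model that counts beats correctly pays nothing extra over the plain classification loss.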

translated by Google Translate

Detecting COVID-19 from digitized ECG printouts using 1D convolutional neural networks

Thao Nguyen , Hieu H. Pham , Huy Khiem Le , Anh Tu Nguyen , Ngoc Tien Thanh , Cuong Do

Category: Computer Vision

2022-08-10

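The COVID-19 pandemic has exposed the vulnerability of healthcare services worldwide, raising the need for novel tools that provide rapid and cost-effective screening and diagnosis. Clinical reports indicate that COVID-19 infection may cause cardiac injury, and that the electrocardiogram (ECG) may serve as a diagnostic biomarker for COVID-19. This study aims to detect COVID-19 automatically from ECG signals. We propose a novel method to extract ECG signals from ECG paper records, which are then fed into a one-dimensional convolutional neural network (1D-CNN) to learn and diagnose the disease. To evaluate the quality of the digitized signals, R peaks in the paper-based ECG images are labeled. The RR intervals calculated from each image are then compared with the RR intervals of the corresponding digitized signal. Experiments on the COVID-19 ECG image dataset show that the proposed digitization method correctly captures the original signals, with a mean absolute error of 28.11 ms. Our proposed 1D-CNN model, trained on the digitized ECG signals, accurately distinguishes individuals with COVID-19 from other subjects, with classification accuracies of 98.42%, 95.63%, and 98.50% for COVID-19 vs. normal, COVID-19 vs. abnormal heartbeats, and COVID-19 vs. other classes, respectively. The proposed method also achieves a high level of performance on the multi-class classification task. Our findings indicate that a deep learning system trained on digitized ECG signals can serve as a potential tool for diagnosing COVID-19.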
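
The digitization check described above (comparing RR intervals from labeled R peaks against those from the digitized signal) can be sketched as follows. This is an illustration only: the function names and the sample R-peak times are hypothetical, and the paper's actual evaluation pipeline may differ.

```python
def rr_intervals_ms(r_peak_times_s):
    """RR intervals in milliseconds from consecutive R-peak times given in seconds."""
    return [(b - a) * 1000.0 for a, b in zip(r_peak_times_s, r_peak_times_s[1:])]

def mean_absolute_error(reference, estimate):
    """Mean absolute difference between two equally long sequences."""
    if len(reference) != len(estimate):
        raise ValueError("sequences must have the same length")
    return sum(abs(r - e) for r, e in zip(reference, estimate)) / len(reference)

# Hypothetical R-peak times: labeled on the paper image vs. found in the digitized signal.
labeled_peaks_s = [0.0, 0.8, 1.6]
digitized_peaks_s = [0.0, 0.82, 1.6]

error_ms = mean_absolute_error(
    rr_intervals_ms(labeled_peaks_s), rr_intervals_ms(digitized_peaks_s)
)
```

Here one R peak is shifted by 20 ms, which changes both neighbouring RR intervals by 20 ms each.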

LightX3ECG: A Lightweight and eXplainable Deep Learning System for 3-lead Electrocardiogram Classification

Khiem H. Le , Hieu H. Pham , Thao BT. Nguyen , Tu A. Nguyen , Tien N. Thanh , Cuong D. Do

Category: Computer Vision | Artificial Intelligence

2022-07-25

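Cardiovascular diseases (CVDs) are a group of heart and blood vessel disorders and one of the most serious dangers to human health, and the number of affected patients is still growing. Early and accurate detection plays a key role in successful treatment and intervention. The electrocardiogram (ECG) is the gold standard for identifying a variety of cardiovascular abnormalities. In clinical practice and in most current research, the standard 12-lead ECG is mainly used. However, using fewer leads can make ECG more pervasive, because it can be conveniently recorded by portable or wearable devices. In this research, we develop a novel deep learning system that accurately identifies multiple cardiovascular abnormalities using only three ECG leads.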

Developing an AI-enabled IIoT platform -- Lessons learned from early use case validation

Holger Eichelberger , Gregory Palmer , Svenja Reimer , Tat Trong Vu , Hieu Do , Sofiane Laridi , Alexander Weber , Claudia Niederée , Thomas Hildebrandt

Category: Artificial Intelligence

2022-07-10

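For a broader adoption of AI in industrial production, adequate infrastructure capabilities are crucial. These include easing the integration of AI with industrial devices and support for distributed deployment, monitoring, and consistent system configuration. Existing IIoT platforms still lack the capabilities to flexibly integrate reusable AI services and relevant standards, such as the Asset Administration Shell or OPC UA, in an open, ecosystem-based manner. This is exactly what our next-level Intelligent Industrial Production Ecosphere (IIP-Ecosphere) platform addresses, using a highly configurable, low-code based approach. In this paper, we introduce the design of the platform and discuss an early evaluation based on a demonstrator for AI-enabled visual quality inspection. This is complemented by insights and lessons learned during this early evaluation activity.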

VSEC: Transformer-based Model for Vietnamese Spelling Correction

Dinh-Truong Do , Ha Thanh Nguyen , Thang Ngoc Bui , Dinh Hieu Vo

Category: Natural Language Processing

2021-11-01

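Spelling error correction is one of the topics with a long history in natural language processing. Although previous studies have achieved remarkable results, challenges remain. In Vietnamese, the state-of-the-art method for the task infers a syllable's context from its adjacent syllables. The accuracy of this method can be unsatisfactory, however, because the model may lose the context if two (or more) spelling mistakes stand next to each other. In this paper, we propose a novel method to correct Vietnamese spelling errors. We tackle the problems of mistyped errors and misspelled errors using a deep learning model. In particular, the embedding layer is powered by the byte pair encoding technique. The sequence-to-sequence model based on the Transformer architecture distinguishes our approach from previous approaches to the same problem. In the experiments, we train the model on a large synthetic dataset into which spelling errors are randomly introduced. We test the performance of the proposed method on a realistic dataset, which contains 11,202 human-made misspellings in 9,341 different Vietnamese sentences. The experimental results show that our method achieves encouraging performance, with 86.8% of errors detected and 81.5% of errors corrected, improving on the state-of-the-art approaches by 5.6% and 2.2%, respectively.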
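
The synthetic training data mentioned above (clean text with randomly introduced spelling errors) could be produced along the following lines. This is a hedged sketch: the three character-level operations, the `error_rate` parameter, and the function names are assumptions chosen for illustration, not the error model used by the authors.

```python
import random

def corrupt_token(token, rng):
    """Apply one random character-level edit to a token (hypothetical error model)."""
    if len(token) < 2:
        return token + token  # too short to swap or delete, so duplicate it
    op = rng.choice(["swap", "delete", "duplicate"])
    i = rng.randrange(len(token) - 1)
    if op == "swap":  # transpose two adjacent characters
        return token[:i] + token[i + 1] + token[i] + token[i + 2:]
    if op == "delete":  # drop one character
        return token[:i] + token[i + 1:]
    return token[:i] + token[i] + token[i:]  # repeat one character

def inject_errors(sentence, error_rate, seed=None):
    """Return a noisy copy of the sentence; each token is corrupted with probability error_rate."""
    rng = random.Random(seed)
    tokens = sentence.split()
    noisy = [corrupt_token(t, rng) if rng.random() < error_rate else t for t in tokens]
    return " ".join(noisy)
```

Pairing each noisy sentence with its clean original yields (input, target) examples for a sequence-to-sequence corrector. Fixing `seed` keeps the generated dataset reproducible.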

Cap4Video: What Can Auxiliary Captions Do for Text-Video Retrieval?

Wenhao Wu , Haipeng Luo , Bo Fang , Jingdong Wang , Wanli Ouyang

Category: Computer Vision

2022-12-31

Most existing text-video retrieval methods focus on cross-modal matching between the visual content of offline videos and textual query sentences. However, in real scenarios, online videos are frequently accompanied by relevant text information such as titles, tags, and even subtitles, which can be utilized to match textual queries. This inspires us to generate associated captions from offline videos to help with existing text-video retrieval methods. To do so, we propose to use the zero-shot video captioner with knowledge of pre-trained web-scale models (e.g., CLIP and GPT-2) to generate captions for offline videos without any training. Given the captions, one question naturally arises: what can auxiliary captions do for text-video retrieval? In this paper, we present a novel framework Cap4Video, which makes use of captions from three aspects: i) Input data: The video and captions can form new video-caption pairs as data augmentation for training. ii) Feature interaction: We perform feature interaction between video and caption to yield enhanced video representations. iii) Output score: The Query-Caption matching branch can be complementary to the original Query-Video matching branch for text-video retrieval. We conduct thorough ablation studies to demonstrate the effectiveness of our method. Without any post-processing, our Cap4Video achieves state-of-the-art performance on MSR-VTT (51.4%), VATEX (66.6%), MSVD (51.8%), and DiDeMo (52.0%).
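
The third aspect (output score) can be illustrated with a small sketch in which a query-caption similarity is blended with the query-video similarity before ranking. The weighted sum, the `caption_weight` parameter, and the example scores are assumptions made for illustration; the abstract does not specify how the two branches are combined.

```python
def fuse_scores(query_video, query_caption, caption_weight=0.5):
    """Blend per-video similarities from the two branches (hypothetical weighted sum)."""
    if len(query_video) != len(query_caption):
        raise ValueError("both branches must score the same set of videos")
    w = caption_weight
    return [(1.0 - w) * v + w * c for v, c in zip(query_video, query_caption)]

def rank_videos(scores):
    """Indices of candidate videos, best match first."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)

# Hypothetical similarities of one text query against three candidate videos.
query_video = [0.30, 0.35, 0.10]
query_caption = [0.60, 0.20, 0.10]

video_only_ranking = rank_videos(query_video)
fused_ranking = rank_videos(fuse_scores(query_video, query_caption))
```

In this toy case the caption branch promotes video 0 above video 1, which is the kind of complementary signal the abstract describes.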

Do Bayesian Variational Autoencoders Know What They Don't Know?

Misha Glazunov , Apostolis Zarras

Category: Machine Learning (Statistics) | Machine Learning

2022-12-29

The problem of detecting the Out-of-Distribution (OoD) inputs is of paramount importance for Deep Neural Networks. It has been previously shown that even Deep Generative Models that allow estimating the density of the inputs may not be reliable and often tend to make over-confident predictions for OoDs, assigning to them a higher density than to the in-distribution data. This over-confidence in a single model can be potentially mitigated with Bayesian inference over the model parameters that take into account epistemic uncertainty. This paper investigates three approaches to Bayesian inference: stochastic gradient Markov chain Monte Carlo, Bayes by Backpropagation, and Stochastic Weight Averaging-Gaussian. The inference is implemented over the weights of the deep neural networks that parameterize the likelihood of the Variational Autoencoder. We empirically evaluate the approaches against several benchmarks that are often used for OoD detection: estimation of the marginal likelihood utilizing sampled model ensemble, typicality test, disagreement score, and Watanabe-Akaike Information Criterion. Finally, we introduce two simple scores that demonstrate the state-of-the-art performance.
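
One of the benchmarks listed above, the Watanabe-Akaike Information Criterion, is commonly computed for OoD detection as the mean of the per-model log-likelihoods minus their variance across sampled models. The sketch below assumes that form; the function name and the example log-likelihood values are hypothetical, and the two new scores introduced in the paper are not reproduced here.

```python
import statistics

def waic_score(log_likelihoods):
    """Mean minus variance of log p(x) across sampled models; lower suggests OoD."""
    mean = statistics.fmean(log_likelihoods)
    variance = statistics.pvariance(log_likelihoods)
    return mean - variance

# Hypothetical log-likelihoods of one input under models sampled from the posterior.
agreeing_models = [-100.0, -100.0, -100.0]
disagreeing_models = [-90.0, -110.0]

in_distribution_like = waic_score(agreeing_models)
ood_like = waic_score(disagreeing_models)
```

Both inputs have the same average log-likelihood, but the one on which the sampled models disagree receives a much lower score, which is how epistemic uncertainty enters the decision.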

How Do Deepfakes Move? Motion Magnification for Deepfake Source Detection

Umur Aybars Ciftci , Ilke Demir

Category: Computer Vision | Artificial Intelligence

2022-12-28

With the proliferation of deep generative models, deepfakes are improving in quality and quantity every day. However, there are subtle authenticity signals in pristine videos that are not replicated by SOTA GANs. We contrast the movement in deepfakes and authentic videos by motion magnification towards building a generalized deepfake source detector. The sub-muscular motion in faces has different interpretations per different generative models, which is reflected in their generative residue. Our approach exploits the difference between real motion and the amplified GAN fingerprints, by combining deep and traditional motion magnification, to detect whether a video is fake and its source generator if so. Evaluating our approach on two multi-source datasets, we obtain 97.17% and 94.03% for video source detection. We compare against the prior deepfake source detector and other complex architectures. We also analyze the importance of magnification amount, phase extraction window, backbone network architecture, sample counts, and sample lengths. Finally, we report our results for different skin tones to assess the bias.

Don't do it: Safer Reinforcement Learning With Rule-based Guidance

Ekaterina Nikonova , Cheng Xue , Jochen Renz

Category: Artificial Intelligence

2022-12-28

During training, reinforcement learning systems interact with the world without considering the safety of their actions. When deployed into the real world, such systems can be dangerous and cause harm to their surroundings. Often, dangerous situations can be mitigated by defining a set of rules that the system should not violate under any conditions. For example, in robot navigation, one safety rule would be to avoid colliding with surrounding objects and people. In this work, we define safety rules in terms of the relationships between the agent and objects and use them to prevent reinforcement learning systems from performing potentially harmful actions. We propose a new safe epsilon-greedy algorithm that uses safety rules to override agents' actions if they are considered to be unsafe. In our experiments, we show that a safe epsilon-greedy policy significantly increases the safety of the agent during training, improves the learning efficiency resulting in much faster convergence, and achieves better performance than the base model.
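
A rule-guarded epsilon-greedy selection of the kind described above might look like the following sketch. It is one possible reading, assuming the safety rules are available as a predicate over actions and that unsafe actions are excluded from both exploration and exploitation. The names `safe_epsilon_greedy` and `is_unsafe` are hypothetical.

```python
import random

def safe_epsilon_greedy(q_values, is_unsafe, epsilon, rng):
    """Pick an action index, never returning one that the safety rules flag as unsafe.

    is_unsafe is a hypothetical callable mapping an action index to True/False.
    """
    safe_actions = [a for a in range(len(q_values)) if not is_unsafe(a)]
    if not safe_actions:
        raise ValueError("the safety rules rule out every action")
    if rng.random() < epsilon:
        return rng.choice(safe_actions)  # explore, but only among safe actions
    return max(safe_actions, key=lambda a: q_values[a])  # exploit the best safe action

# Example: action 1 has the highest value but would cause a collision.
rng = random.Random(0)
chosen = safe_epsilon_greedy([1.0, 5.0, 2.0], lambda a: a == 1, epsilon=0.0, rng=rng)
```

With `epsilon=0.0` the agent would normally pick action 1; here the rule overrides that choice and the best remaining safe action is taken instead.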

MixupE: Understanding and Improving Mixup from Directional Derivative Perspective

Vikas Verma , Sarthak Mittal , Wai Hoh Tang , Hieu Pham , Juho Kannala , Yoshua Bengio , Arno Solin , Kenji Kawaguchi

Category: Machine Learning | Computer Vision

2022-12-27

Mixup is a popular data augmentation technique for training deep neural networks where additional samples are generated by linearly interpolating pairs of inputs and their labels. This technique is known to improve the generalization performance in many learning paradigms and applications. In this work, we first analyze Mixup and show that it implicitly regularizes infinitely many directional derivatives of all orders. We then propose a new method to improve Mixup based on the novel insight. To demonstrate the effectiveness of the proposed method, we conduct experiments across various domains such as images, tabular data, speech, and graphs. Our results show that the proposed method improves Mixup across various datasets using a variety of architectures, for instance, exhibiting an improvement over Mixup by 0.8% in ImageNet top-1 accuracy.
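
The baseline technique described in the first sentence, standard Mixup, can be sketched as below. This is not the improved method proposed in the paper. Drawing the mixing coefficient from a Beta(alpha, alpha) distribution is the usual choice for Mixup and is assumed here; the function name and default `alpha` are illustrative.

```python
import random

def mixup(x1, y1, x2, y2, alpha=1.0, rng=None):
    """Linearly interpolate two inputs and their (one-hot) labels with the same coefficient."""
    rng = rng or random.Random()
    lam = rng.betavariate(alpha, alpha)  # mixing coefficient in [0, 1]
    x = [lam * a + (1.0 - lam) * b for a, b in zip(x1, x2)]
    y = [lam * a + (1.0 - lam) * b for a, b in zip(y1, y2)]
    return x, y, lam

# Example: mix two toy samples that belong to different classes.
mixed_x, mixed_y, lam = mixup(
    [1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0], alpha=1.0, rng=random.Random(0)
)
```

Because the labels are mixed with the same coefficient as the inputs, the soft label still sums to one.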
